Kinetics of viral self-assembly: the role of ss RNA antenna
A large class of viruses self-assemble from a large number of identical capsid proteins with long
flexible N-terminal tails and ss RNA. We study the role of the strong Coulomb interaction of the positive
N-terminal tails with ss RNA in the kinetics of in vitro virus self-assembly. Capsid proteins stick
to the unassembled chain of ss RNA (which we call the "antenna") and slide on it towards the assembly site.
We show that at an excess of capsid proteins such one-dimensional diffusion accelerates self-assembly
more than ten times. On the other hand, at an excess of ss RNA, the antenna slows self-assembly down.
Several experiments are proposed to verify the role of the ss RNA antenna.



Viruses self-assemble in host cells from identical capsid proteins (CPs) and their genome,
which in many cases is a long single-stranded (ss) RNA. Icosahedral viruses are formed from
$60T$ CPs for only certain triangulation numbers $T$, such as 1, 3, 4, or 7. The Coulomb
interaction between CPs and ss RNA plays an important role in their self-assembly. Two recent
papers emphasized that the CPs of a large class of $T = 3$ and $T = 4$ viruses have long flexible
N-terminal tails. They explored the role played in the energetics of the virus structure by
the Coulomb interaction between the brush of positive N-terminal tails rooted at the inner
surface of the capsid and the negative ss RNA molecule (see Fig. 1a). It was shown [7] that
virus particles are most stable when the total length of ss RNA is close to the total length
of the tails. For such a structure the absolute value of the total (negative) charge of ss RNA
is approximately two times larger than the charge of the capsid. This conclusion agrees with
available structural data. (A similar result was obtained earlier assuming that the positive
charge of the CPs is smeared over the inner surface of the capsid.)

In this paper we continue to deal with the electrostatic interaction of N-terminal tails and
ss RNA, but switch our attention from the thermodynamics to the kinetics of in vitro
self-assembly. Most papers on in vitro kinetics study the self-assembly of an empty capsid at
salt concentrations much higher than biological ones, where the Coulomb repulsion of capsid
proteins is screened and hydrophobic interactions dominate [9, 12]. In Ref. [9] one can clearly
discriminate the initial nucleation "lag phase", followed by the "growth phase", where the
average mass of the assembled particles grows linearly with time. A recent study of the
kinetics of self-assembly with the ss RNA genome emphasizes that CPs stick to ss RNA before the
assembly, so that a virus is actually assembled from a linear CP-RNA complex. Not much is known
about the nucleation and growth phases of such assembly.

The goal of this paper is to understand the role of the large length of ss RNA in the kinetics
of self-assembly at biological salt concentrations. We assume that after nucleation (for
example, at one end of the ss RNA) the capsid growth is limited by CP diffusion. We calculate
the acceleration of self-assembly, which originates from the fact that, due to the Coulomb
interaction of N-terminal tails with ss RNA, CPs stick to ss RNA and slide on it to the
assembly site. In this case, ss RNA plays the role of a large antenna capturing CPs from the
solution and leading them to the assembly site. Figure 1 illustrates this process. We show
below that for a $T=3$ virus this mechanism can accelerate self-assembly by approximately 15
times.




FIG. 1: (color online) (a) A blowup view from the inside of the virus. The brush of positive
N-terminal tails (red/dark gray line) is rooted at the inner surface of the capsid (blue/light
gray block). The ss RNA (green/gray line) strongly interacts with the tails and glues all the
CPs together. (b) Schematic model of the capsid self-assembly. The unassembled ss RNA makes an
antenna of size $R$ for the one-dimensional pathway of the CPs towards the capsid assembly site
at the capsid fragment (dashed circle with radius $r$ of the size of a CP).



We consider a dilute solution of virus CPs with molecules of its ss RNA genome. For most of
this paper we assume that the protein concentration is $c \sim 2Mc_R$, where $c_R$ is the
concentration of ss RNA and $M$ is the number of proteins in the assembled virus (for $T=3$
viruses $M = 60T = 180$). In this case there are enough proteins in the system to assemble a
virus around each ss RNA molecule, and $c$ changes weakly in the course of assembly. Viruses,
however, self-assemble only when the concentration $c$ of CPs is larger than some threshold
concentration $c_1$ [4], which is similar to the critical micelle concentration for the
self-assembly of surfactant molecules. The critical concentration $c_1$ can be estimated as

$$c_1 \simeq \frac{1}{v}\exp\left[-\frac{\epsilon_e + \epsilon_p}{k_BT}\right], \tag{1}$$

where $v$ is the CP volume, $\epsilon_e$ is the absolute value of the electrostatic adsorption
energy of the CP N-terminal tail to ss RNA, and $\epsilon_p$ is the absolute value of the CP-CP
attraction energy in the capsid (per CP). Both $\epsilon_e$ and $\epsilon_p$ can be of the
order of $10k_BT$, so that the critical concentration $c_1$ can be very small. In this paper we
always assume that $c > c_1$. As shown in Ref. [7], in a partially assembled capsid, each CP
sticks to a piece of ss RNA of length equal to the tail length $L$ (Fig. 1a). A partially
assembled capsid with $m < M$ CPs encapsulates the length $mL$ of ss RNA. To continue this
process the next, $(m+1)$th, CP should attach itself to the partially assembled capsid at the
site where the ss RNA goes out of the capsid (see Fig. 1b) and where this CP gets more nearest
neighbors. We call this slowly moving site "the assembly site". It has a size of the order of
the size $r$ of a CP (see Fig. 1b).

CPs diffuse to the assembly site through the bulk water. For $c \gg c_1$ one can neglect the
dissociation flux from the assembly site. In this case the net rate of assembly (the number of
CPs joining the capsid per unit time) is equal to the rate at which diffusing CPs find an
absorbing sphere of radius $r$. It is equal to the Smoluchowski three-dimensional reaction
rate [13]

$$J_3 = 4\pi D_3 r c, \tag{2}$$

where $D_3$ is the diffusion coefficient of a CP in water. The rate $J_3$ as a function of the
CP concentration $c$ is plotted in Fig. 2 by the dashed straight line.

FIG. 2: Schematic plot of the diffusion-limited self-assembly rate $J$ as a function of the
protein concentration $c$. The full line is for the sliding of capsid proteins on ss RNA. The
rate for the slower three-dimensional diffusion is shown by the dashed line.

Our main idea is that the long chain of yet unassembled ss RNA outside of the capsid provides
an additional route for the diffusion of CPs to the assembly site, in analogy to the well-known
faster-than-diffusion location of a specific site on DNA by a protein [14, 15]. The dramatic
enhancement of the assembly rate is achieved because, due to the Boltzmann factor
$\exp[\epsilon_e/k_BT]$, the three-dimensional concentration of CPs on the unassembled chain of
ss RNA is larger than the bulk concentration $c$. This concentration can be estimated using a
cylinder with cross-section $v^{2/3}$ built around the RNA as its axis: it is equal to the
number of CPs per unit length of ss RNA divided by $v^{2/3}$. At large distances the
one-dimensional flux of CPs sliding on the ss RNA should be balanced by the three-dimensional
diffusion flux of CPs to the ss RNA. This balance determines the radius $\xi$ of the sphere
around the assembly site at which the two fluxes match each other and the crossover between
three-dimensional and one-dimensional diffusion of CPs takes place. The ss RNA coil inside this
radius is called the antenna.

The maximum possible antenna size is the characteristic size $R \sim (p\mathcal{L}_e)^{1/2}$
of the unassembled portion of ss RNA with length $\mathcal{L}_e = \mathcal{L} - mL$, where
$\mathcal{L}$ is the total length of the ss RNA. (Here we assume that the ss RNA is a flexible
Gaussian coil with the persistence length $p \sim 2b \sim 1.5$ nm, where $b \sim 0.7$ nm is the
monomer size, and do not account for the excluded volume interaction.) In the case when
$\xi = R$, the whole ss RNA adsorbs CPs arriving by three-dimensional diffusion and provides a
path of fast one-dimensional diffusion to the assembly site (see Fig. 1). As a result, in this
case the size $R$ replaces the protein size $r$ in Eq. (2), leading to a much faster rate

$$J = 4\pi D_3 R c, \tag{3}$$

which is shown in Fig. 2 by the part of the solid line parallel to the dashed one. Equation (3)
is correct as long as the CPs adsorbed on the unassembled chain of ss RNA are still sparse and
do not block each other's diffusion on the ss RNA. Let us use the notation $c_2$ for the
concentration $c$ at which the antenna becomes saturated by CPs and the dependence of the
self-assembly rate $J$ on $c$ saturates, roughly speaking, at the level
$J_{max} = 4\pi D_3 r/v$, which is the Smoluchowski rate $J_3$ at $c \sim 1/v$ (see the solid
line in Fig. 2). It has been shown that, if $\xi < R$,

$$c_2 = \frac{1}{v}\exp\left[-\frac{\epsilon_e}{k_BT}\right] \simeq c_1\exp\left[\frac{\epsilon_p}{k_BT}\right]. \tag{4}$$

We see that the largest enhancement $R/r$ of the self-assembly rate $J$ can be achieved in the
range of relatively small CP concentrations $c_1 \ll c \ll c_2$. For a typical $T=3$ virus the
ss RNA genome consists of 3000 bases, so that the length $\mathcal{L} \sim 2100$ nm and
$R \sim 60$ nm. Using $r \sim 4$ nm, we arrive at the acceleration factor $R/r \sim 15$.
One can calculate the assembly time limited by diffusion. As we said above, for
$c \sim 2Mc_R$ the concentration of proteins $c$ can be regarded as a constant. Thus, the
assembly time with the help of the antenna, $\tau_a$, is given by

$$\tau_a = \sum_{m=0}^{M-1} \frac{1}{4\pi c D_3 [(M-m)Lp]^{1/2}} \sim \frac{M^{1/2}}{4\pi c D_3 (Lp)^{1/2}}, \tag{5}$$

while according to Eq. (2), the assembly time without the antenna is simply
$\tau_0 = M/(4\pi c D_3 r)$. Since $(Lp)^{1/2} \sim 4$ nm, we can neglect the difference
between $(Lp)^{1/2}$ and $r$, and arrive at an assembly time with the help of the antenna that
is $M^{1/2} \approx 14$ times shorter than $\tau_0$.
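The length scales entering these estimates can be checked with a few lines of arithmetic. The sketch below only reuses numbers quoted in the text; the variable names are illustrative, and the identification of the tail length with $\mathcal{L}/M$ is an assumption based on the statement that the total RNA length is close to the total tail length.

```python
import math

# Values quoted in the text (all lengths in nm).
M = 180        # proteins in a T=3 capsid
BASES = 3000   # genome length in bases
b = 0.7        # monomer size
p = 1.5        # persistence length of ss RNA
r = 4.0        # size of a capsid protein

contour_length = BASES * b                  # total ss RNA length
coil_size = math.sqrt(p * contour_length)   # Gaussian coil size R of the free RNA

# Assumption: the RNA is fully packed at m = M, so each protein takes L = contour/M.
tail_length = contour_length / M
step_size = math.sqrt(tail_length * p)      # (L p)^(1/2), to be compared with r

print(f"contour length = {contour_length:.0f} nm")
print(f"R = {coil_size:.1f} nm, R/r = {coil_size / r:.1f}")
print(f"(L p)^(1/2) = {step_size:.2f} nm, M^(1/2) = {math.sqrt(M):.1f}")
```

With these inputs the coil size comes out slightly below 60 nm and $(Lp)^{1/2}$ close to 4 nm, in line with the rounded values used above.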

Strictly speaking, these estimates are correct only for self-assembly with a homopolymeric
ss RNA or a synthetic negative polyelectrolyte. For these cases, a small additional
acceleration by a factor of 2 or 3 can be provided by the excluded volume effect. On the other
hand, native ss RNA is more compact than a Gaussian coil due to hydrogen bonds forming
hairpins, and thus the estimated acceleration can be reduced by a factor of 2 to 3. Above, for
simplicity, we replaced $\xi$ by its maximum value $R$. The actual calculation of the antenna
size $\xi$ can follow the logic of the scaling estimate for the search rate of a specific site
on DNA by a protein in Ref. [15]. In our case, the assembly site plays the role of the target
site (diffusion sink) for the protein, the unassembled chain of ss RNA plays the role of DNA,
and the Coulomb attraction energy of N-terminal tails to the unassembled ss RNA is analogous
to the non-specific binding energy of the diffusing protein on DNA. One may argue that the
virus self-assembly problem is different, because ss RNA plays a dual role. It is not only an
antenna for the sliding CPs; the ss RNA itself also moves to the assembly site, where it gets
packed inside the capsid (each newly assembled CP consumes the length $L$ of ss RNA). However,
for a small concentration $c$ in the range $c_1 < c < c_2$, where the unassembled ss RNA chain
is weakly covered by CPs, the velocity of the ss RNA drift in the direction of the assembly
site is much smaller than the average velocity of the CP drift along the ss RNA. Thus, for the
calculation of the assembly rate at a given length of the unassembled ss RNA chain we can use
the approximation of static ss RNA. This brings us back to the problem of proteins searching
for a specific site on DNA [15]. Note that this means that the idea of self-assembly from a
prepared linear ss RNA-protein complex is literally correct only at $c > c_2$.

It is shown in Ref. [15] that for a flexible ss RNA the antenna size is
$\xi \sim b(yd)^{1/3}$, where $y = \exp(\epsilon_e/k_BT)$, $d = D_1/D_3$, and $D_1$ is the
one-dimensional diffusion coefficient of the protein sliding on ss RNA. This result remains
correct as long as the antenna size $\xi$ is smaller than the ss RNA coil size $R$. The energy
$\epsilon_e$ of adsorption of the N-terminal tail with approximately 10 positive charges on
ss RNA can be as large as $10k_BT$. For $d = 1$ we get $\xi \sim 30$ nm, while $R \sim 60$ nm.
Thus, a simple estimate leads to an antenna size $\xi$ somewhat smaller than $R$.

There are, however, two reasons why $\xi$ may easily reach its maximum value $R$. First, some
viruses self-assemble from dimers. Naturally, dimers with their two positive tails bind to
ss RNA with twice the energy, $2\epsilon_e$. This easily makes $\xi > R$. Second, the theory of
Ref. [15] assumes that a sliding protein molecule has only one positive patch, by which it can
attach to double-helix DNA. Even if two pieces of DNA that are distant along the chain come
close in three-dimensional space, such a protein cannot simultaneously bind both pieces and,
therefore, cannot crawl between them without desorbing into water and losing the binding
energy $\epsilon_e$. For a globular protein this is quite a natural assumption. On the other
hand, for a CP attached to ss RNA by a flexible N-terminal tail, the tail can easily cross
over (crawl) between two adjacent pieces of the same ss RNA molecule, losing only a small
fraction of the energy $\epsilon_e$. This should lead to faster protein diffusion on ss RNA
and may easily push $\xi$ up to $R$.
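The sensitivity of the antenna size to the binding energy can be illustrated numerically. The sketch below evaluates the scaling formula $\xi \sim b(yd)^{1/3}$ with the prefactor set to one, which is an assumption (a scaling relation does not fix numerical factors), so only orders of magnitude are meaningful. The function name is illustrative.

```python
import math

b = 0.7    # nm, monomer size quoted in the text
R = 60.0   # nm, coil size of the unassembled ss RNA quoted in the text

def antenna_size(eps_over_kT, d=1.0, monomer=b):
    """Scaling estimate xi ~ b (y d)^(1/3) with y = exp(eps_e / k_B T).

    The prefactor is taken to be 1 (assumption); d = D_1 / D_3.
    """
    y = math.exp(eps_over_kT)
    return monomer * (y * d) ** (1.0 / 3.0)

xi_monomer = antenna_size(10.0)   # single tail, eps_e = 10 k_B T
xi_dimer = antenna_size(20.0)     # dimer, binding energy 2 eps_e = 20 k_B T

# The physical antenna cannot exceed the coil size R.
effective_monomer = min(xi_monomer, R)
effective_dimer = min(xi_dimer, R)

print(f"monomer: xi = {xi_monomer:.0f} nm, antenna = {effective_monomer:.0f} nm")
print(f"dimer:   xi = {xi_dimer:.0f} nm, antenna = {effective_dimer:.0f} nm")
```

For a single tail the estimate is a few tens of nanometers, below $R$, whereas doubling the binding energy raises the formal value of $\xi$ far above $R$, so that the antenna is then limited by the coil size.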

Let us discuss ideas for three in vitro experiments that can verify the role of the ss RNA
antenna in virus self-assembly. In the first experiment, one breaks the ss RNA molecule into
$K \gg 1$ short pieces of approximately equal length. It was shown [17] that assembly is
possible even when $K \sim M/2$, because in order to glue CPs together a short ss RNA should
bind two N-terminal tails of neighboring proteins in the capsid. Virus assembly from short
ss RNA pieces goes consecutively through two different diffusion-limited stages. In the first
stage, capsid fragments (CFs) made of $M/K$ proteins self-assemble on each short ss RNA
molecule. According to Eq. (5), the time necessary for this stage is proportional to
$(M/K)^{1/2}$ and is much shorter than the assembly time $\tau_a$ with the intact ss RNA. The
second stage, where CFs aggregate to form the whole capsid, takes a much longer time
$\tau_{as}$ ($s$ stands for short). In order to calculate $\tau_{as}$ we assume that when two
CFs with $n$ CPs each collide, they can relatively quickly rearrange their ss RNA and CPs in
order to make one bigger CF with $2n$ CPs. We also assume that at any time $t$ all CFs are
approximately of the same size $n(t)$. Then the concentration of such CFs is
$c(n) = c_RM/n(t)$, where $c_R$ is the concentration of the original intact ss RNA. Therefore,
the time required for the doubling of a CF can be estimated from Eq. (2):

$$\tau(n) = \frac{1}{4\pi D_3(n) r(n) c(n)} = \frac{n}{4\pi D_3(n) r(n) c_R M}, \tag{6}$$



where $D_3(n)$ and $r(n)$ are the diffusion coefficient and effective radius of a CF with $n$
CPs. Since the diffusion coefficient is inversely proportional to the droplet radius, the
product $D_3(n)r(n) = k_BT/6\pi\eta$ (where $\eta$ is the water viscosity) is the same constant
as $D_3r$ for a single protein. One collision of droplets transfers $n$ CPs to the growing CF.
Therefore, the average time needed to add one CP to the growing CF,
$\tau_1 = \tau(n)/n = 1/(4\pi D_3 r M c_R)$, does not depend on $n$. In other words, the number
$n(t)$ of CPs per CF increases at a constant rate. The assembly ends when $n$ reaches $M$.
Therefore, the assembly time is given by

$$\tau_{as} = M\tau_1 = \frac{1}{4\pi c_R D_3 r}. \tag{7}$$



The above equation shows that the assembly time depends on $Mc_R$, which stands for the
concentration of CPs involved in the CF aggregation. However, $\tau_{as}$ has no dependence on
$K$. Comparing Eqs. (5) and (7), we obtain that at $c \sim 2Mc_R$

$$\frac{\tau_{as}}{\tau_a} \sim M^{1/2}\frac{(Lp)^{1/2}}{r} \sim M^{1/2} \gg 1. \tag{8}$$

We see that the virus assembly time with short ss RNA pieces is much longer than that for the
intact ss RNA. This happens because breaking the original ss RNA destroys its big antenna.
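The statement that $\tau_{as}$ does not depend on $K$ can be checked by summing the doubling times of Eq. (6) explicitly. The sketch below is a minimal illustration in dimensionless units ($c_R = 1$, $D_3 r = 1$); the function names and the restriction of $K$ to powers of two (so that repeated doubling lands exactly on $n = M$) are assumptions made for simplicity.

```python
import math

def doubling_time(n, M, c_R, D3r):
    """Eq. (6): time for fragments of n proteins to merge pairwise.

    D3r is the size-independent product D_3(n) r(n).
    """
    return n / (4.0 * math.pi * D3r * c_R * M)

def aggregation_time(M, K, c_R=1.0, D3r=1.0):
    """Total time to grow fragments from n = M/K to n = M by successive doublings.

    Assumes K is a power of two, so that doubling reaches n = M exactly.
    """
    n = M / K
    total = 0.0
    while n < M:
        total += doubling_time(n, M, c_R, D3r)
        n *= 2
    return total

M = 180
tau_as_eq7 = 1.0 / (4.0 * math.pi)   # Eq. (7) with c_R = D3r = 1

for K in (4, 16, 64):
    ratio = aggregation_time(M, K) / tau_as_eq7
    print(f"K = {K:3d}: tau_as / Eq.(7) = {ratio:.4f}")
```

In this simple doubling scheme the total time equals the value of Eq. (7) multiplied by $1 - 1/K$, so the dependence on $K$ disappears for $K \gg 1$, as stated above.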




FIG. 3: Self-assembly times plotted schematically as a function of $x = c/Mc_R$.
$\tau_0 = M/(4\pi c D_3 r)$ is the assembly time without the effect of ss RNA at $x > 1$. The
dark and gray lines correspond to intact ss RNA and short RNA pieces, respectively.

In the second experiment, we return to the intact ss RNA and discuss what happens when we vary
the relative concentrations of CPs and ss RNA, $x = c/Mc_R$, for example, keeping
$c = \mathrm{const}$ and changing $c_R$. Until now we assumed that $x \sim 2$, i.e., we have
marginally more proteins than is necessary to assemble a virus on every ss RNA. If $x \gg 1$
the assembly time $\tau_a$ is practically the same as that at $x \sim 2$ and is given by
Eq. (5). Let us now consider much larger $c_R$, for which $x \ll 1$. Here the situation changes
dramatically. There are two assembly stages. In the first stage, a CF is assembled with part
of each ss RNA molecule, leaving the rest of the ss RNA molecule as a tail. This assembly uses
up all the proteins and stops when all CFs are still much smaller than the complete capsid and
their ss RNA tails are long (see, for example, Fig. 1b). This state is essentially a kinetic
trap. If the energies $\epsilon_e$ and $\epsilon_p$ are much larger than $k_BT$, CFs on
different ss RNA molecules cannot exchange CPs through the solution or via collisions of their
ss RNA tails. They can grow only via CF-CF collisions, merging on one ss RNA and releasing the
other, empty one. As we explained above, at $x > 1$ (CPs are in excess), CFs without RNA tails
produce a capsid during the time given by Eq. (7). On the other hand, at $x < 1$, only ss RNA
molecules occupied by CPs take part in the aggregation and, in order to get the assembly time,
$c_R$ in Eq. (7) should be replaced by $c/M$, which does not depend on $x$. However, due to the
long ss RNA tail, a CF diffuses more slowly than it does without a tail. The time $\tau_a(x)$
grows substantially with decreasing $x$, because with more ss RNA the initial CFs have fewer
CPs and longer ss RNA tails. This time saturates at $x \sim 1/M$, where $c = c_R$ and each CF
has only one protein and the longest ss RNA tail. Thus, a long antenna accelerates assembly at
$x > 1$ and decelerates it at $x < 1$. This behavior of $\tau_a(x)$ is schematically plotted
in Fig. 3.

In the third experiment we can combine the first two and break the ss RNA into pieces at
several different values of $x$. At $x < 1$ a CF gets a shorter tail of ss RNA and a larger
mobility, so that assembly is faster than for intact ss RNA. When $x > 1$, the assembly time
grows according to Eq. (7) with decreasing $c_R$ (increasing $x$). This is because the smaller
the ss RNA concentration, the harder it is for the CFs to collide with each other and form
larger CFs. In other words, the kinetics is determined only by the CPs already assembled in
CFs, and their number decreases with growing $x$. We illustrate this nontrivial role of broken
ss RNA in Fig. 3.

Now let us give some numerical estimates for $c_1$, $c_2$ and $\tau_0$ for in vitro assembly.
Using the CP radius $r \sim 4$ nm, we obtain $c_1 \sim 0.1$ nM and $c_2 \sim 1$ $\mu$M from
Eqs. (1) and (4). For $c \sim 1$ nM and the diffusion coefficient
$D_3 \sim 2 \times 10^{-7}$ cm$^2$/s, the assembly time $\tau_0$ is about 10 min. At an excess
of CPs, the ss RNA antenna reduces it to $\tau_a \sim 1$ min. At an excess of ss RNA, roughly
speaking, $\tau_a$ increases to $2\tau_0$. One can make $\tau_a$ even larger by using ss RNA
much longer than the native one.
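These order-of-magnitude estimates can be reproduced with a short calculation. The sketch below evaluates Eqs. (1), (2) and (4) under two stated assumptions that are not fixed by the text: the CP volume is taken as $v = r^3$, and both energies are set to $\epsilon_e = \epsilon_p = 10k_BT$. The helper name is illustrative.

```python
import math

AVOGADRO = 6.022e23   # particles per mole

r_nm = 4.0            # CP size from the text
v_nm3 = r_nm ** 3     # assumption: CP volume v ~ r^3
eps_e = 10.0          # assumption: adsorption energy in units of k_B T
eps_p = 10.0          # assumption: CP-CP attraction energy in units of k_B T
M = 180               # proteins per T=3 capsid

def per_nm3_to_molar(n_per_nm3):
    """Convert a number density in nm^-3 to mol/L (1 L = 1e24 nm^3)."""
    return n_per_nm3 * 1e24 / AVOGADRO

c1_molar = per_nm3_to_molar(math.exp(-(eps_e + eps_p)) / v_nm3)   # Eq. (1)
c2_molar = per_nm3_to_molar(math.exp(-eps_e) / v_nm3)             # Eq. (4)

# Assembly time without the antenna, tau_0 = M / J_3, in CGS units.
c_molar = 1e-9                             # 1 nM, from the text
D3_cm2_s = 2e-7                            # from the text
r_cm = r_nm * 1e-7
c_per_cm3 = c_molar * AVOGADRO / 1000.0    # 1 L = 1000 cm^3
J3_per_s = 4.0 * math.pi * D3_cm2_s * r_cm * c_per_cm3   # Eq. (2)
tau0_seconds = M / J3_per_s

print(f"c1 ~ {c1_molar * 1e9:.2f} nM, c2 ~ {c2_molar * 1e6:.1f} uM")
print(f"tau_0 ~ {tau0_seconds / 60:.0f} min")
```

With these inputs $c_2$ comes out close to 1 $\mu$M, $c_1$ within a factor of two of 0.1 nM, and $\tau_0$ is several minutes, the same order of magnitude as the values quoted above; the exact numbers depend on the assumed $v$ and energies.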

In conclusion, we studied the role played by the unassembled tail of ss RNA, which we call the
antenna. We showed that one-dimensional diffusion accelerates virus self-assembly more than
ten times when proteins are in excess with respect to RNA. On the other hand, when RNA is in
excess, the long tail of ss RNA slows down the assembly. We discussed several experiments
which can verify the role of the antenna. Although in this paper we focus on viruses for which
CPs have long positive N-terminal tails, our idea can also be applied to the case where a CP
binds to ss RNA by its positive patch. Our ideas are applicable beyond icosahedral viruses,
for example, to the assembly of immature retroviruses such as RSV or HIV.